DETAILED ACTION



Response to Arguments 



1. Applicants amended claims 1, 9, 18, and 28 in the amendment received
on 02/03/2003. The pending claims are 1-34. Applicants' arguments have been fully
considered but they are not persuasive.



As argued by applicants on page 12 of the amendment:

Claims 1 and 18 recites a way of refining a node in a decision tree by selecting a feature of 
those that characterize the data associated with the node, then performing a cluster analysis along the 
selected feature, and then constructing arcs of the decision tree for each of the clusters. Thus, a cluster 
analysis is performed in refining a node in a decision tree, enabling the decision to be built "on the fly" 
(see Spec, p. 6). 

By contrast, Hall et al. does not show this way of building a decision tree, by performing a 
cluster analysis in refining a node of a decision tree. Rather, Hall et al. is directed to a method of 
developing of fuzzy rules from continuous valued data by building a decision tree in accordance with 
the C4.5 algorithm (Abstract, p. 1757, col. 1). 

However, Hall et al. recognize that the "C4.5 algorithm tree algorithm requires crisp class 
assignments for all objects. It is necessary to partition the continuous output values into a effect set of 
discrete output classes." (Section 2.1, p. 1758, col. 1, emphasis added). Accordingly, Hall et al. 
propose to preprocess the data first by applying a fuzzy c-means clustering to determine the discrete 
classes, and then feeding the discrete classes into the C4.5 algorithm: "After a discrete class has been 
created for each example, as discussed in Section 2.1, C4.5 may be used to create a decision tree." 
(Section 3, p. 1759, col. 1). 

Accordingly, Hall et al. fails to teach or suggest "performing a cluster analysis along the 
selected feature to group the data into one or more clusters" since whatever cluster analysis that is 
performed in Hall et al. is performed before building the decision tree, that is, without selecting a 
feature when refining a node of a decision tree. The remaining references, Shafer et al. and Choe et al., 
also fail to teach this aspect of claims 1-9 and 18-26. 

Examiner respectfully traverses for the following reasons:



Table 1 on page 1757 shows a training set from the domain of tennis, used to
determine whether to play tennis based on the Weather (Sunny, Cloudy), Wind (Windy,
Quiet), and Temperature (0, 100° F), with two outcomes (Play, Don't Play). Given
the training examples, C4.5 would produce the simple decision tree shown in FIG. 1 on
page 1758. The decision tree allows the classification of examples into n classes (n = 2
here) by choosing an attribute whose values may split the examples up into more
homogeneous groups. In this example, the attribute Temperature is chosen to be
associated with the root node (Decision trees from C4.5, page 1757), which indicates the
step of selecting a feature from among the features characterizing the data associated with
the node. Hall further discloses that the values of a continuous valued attribute are each
examined as a possible split of the example set of a node in a decision tree,
and a value in the data set is chosen as the "split point" (Decision trees from C4.5,
pages 1757-1758). Here, the continuous valued attribute is Temperature, and its
values are examined in order to cluster the data into two groups, one with temperature <=
80 and one with temperature > 80. This technique indicates the step of performing a
cluster analysis along the selected feature to group the data into one or more clusters.
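
For illustration only, the split-point selection described above can be sketched in
Python. This is a minimal sketch and not code from Hall or from C4.5 itself: the
function names are hypothetical, the six training examples are invented and are not
Table 1 of Hall, and plain information gain is used rather than C4.5's gain ratio.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the binary split value <= t / value > t
    with the highest information gain. Every observed value except the
    largest is tried as a candidate split point."""
    base = entropy(labels)
    n = len(values)
    best_t, best_gain = None, -1.0
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical temperatures and outcomes (NOT the data of Hall's Table 1).
temps = [60, 70, 75, 80, 85, 90]
outcomes = ["Play", "Play", "Play", "Play", "Don't Play", "Don't Play"]
t, gain = best_threshold(temps, outcomes)
print(t, round(gain, 3))  # -> 80 0.918
```

On this invented data the chosen split point is 80, so the examples fall into one
group with temperature <= 80 and one with temperature > 80.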

Examiner agrees with the applicants' argument that Hall et al. is directed to a method of
developing of fuzzy rules from continuous valued data by building a decision tree in
accordance with the C4.5 algorithm (Abstract, p. 1757, col. 1). However, the technique of
creating a decision tree as disclosed by Hall as discussed above is implemented by the
C4.5 learning system before the fuzzy rules could be extracted from the decision tree
(page 1757, Introduction).

As argued by applicants on page 13 of the amendment:

Hall et al, alone or in combination with Shafer et al and Choe et al, fail to teach or suggest the 
limitations of claims 2-3, 10-17, 19-20, and 27-34. For example, independent claims 10 and 27 recite: 
performing a plurality of cluster analyses along each of the features to calculate a maximal cluster 
validity measure, said maximal cluster validity measure corresponding to one of the features; 
selecting the one of the features corresponding to the maximal cluster validity measure;
Dependent claims 2 and 19 also affirmatively recite these limitations. None of the references show the
recited "maximal cluster validity measure" calculated by performed a plurality of cluster analyses and 
selecting one of the features that corresponds to the maximal cluster validity measure. Moreover, 
claims 3, 11, 17, 20, 27, and 34 specify a specific kind of maximal cluster validity measure that based 
on the "partition coefficient." 

As explained above, Hall et al discloses a method of generating fuzzy rules from data by first 
performing a fuzzy cluster analysis to determine crisp, discrete classes for the data and then applying 
the C4.5 decision tree algorithm to the discrete classes. Since the C4.5 decision tree algorithm requires 
discrete classes, the C4.5 algorithms selects its features to build the decision tree based on the "highest 
information gain associated with it" (Section 2, p. 1758, col. 1), but not on a "maximal cluster validity
measure" or a "partition coefficient" based on performing cluster analyses as recited in the claims. 



Examiner respectfully traverses for the following reasons:
As disclosed by Hall, the training set is given to C4.5 as the decision tree
learning system. The values of a continuous valued attribute are each
examined as a possible split of the example set of a node in a decision tree.
The selection of a specific value is based upon the information gain ratio associated
with choosing that attribute. The attribute that has the highest information gain is
chosen as the attribute for splitting the examples at a node (Decision trees from C4.5,
pages 1757-1758). As claimed by the applicants, the maximal cluster validity measure
as defined in claims 10 and 27 is just a variable that corresponds to one of the features, and
the feature corresponding to the maximal cluster validity measure is selected for
subdividing the data into one or more groups based on the selected feature. Thus, the
maximum information gain still satisfies the condition of the claimed maximal cluster
validity measure. In addition, the maximum information gain is chosen from among the
calculated information gain ratios, which correspond to the claimed partition coefficients.
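
For reference only, the following Python sketch shows how a partition coefficient is
commonly computed from a fuzzy membership matrix. It assumes the term carries its
usual meaning in the fuzzy clustering literature (Bezdek's partition coefficient, the
mean over data points of the sum of squared memberships); this action does not itself
define the term, and the function name and the membership matrices below are
hypothetical.

```python
def partition_coefficient(U):
    """Bezdek-style partition coefficient of a fuzzy partition.

    U is a list of rows, one row per data point; each row holds that point's
    memberships in every cluster and is assumed to sum to 1. The result lies
    between 1/c (completely fuzzy) and 1 (completely crisp) for c clusters.
    """
    n = len(U)
    return sum(u * u for row in U for u in row) / n


# Hypothetical membership matrices for two data points and two clusters.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]
print(partition_coefficient(crisp))  # -> 1.0
print(partition_coefficient(fuzzy))  # -> 0.5
```

Under this assumed definition the measure is computed from cluster memberships,
whereas the information gain discussed above is computed from class label counts.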



With regard to the applicants' arguments on page 14 of the amendment concerning claims
7-8, 15, 24-25, and 32, Examiner respectfully traverses because Choe teaches a clustering
criterion based on an error tolerance ε as a predetermined threshold: a cluster center is
calculated to update a fuzzy c-partition U, and if the difference between two consecutive U
is less than or equal to the error tolerance (the predetermined relationship), the data is
grouped into the cluster (Choe, Fuzzy C-Means Algorithm, ALGORITHM 1, Step 6).
Thus, instead of the difference between two consecutive U, a domain ratio could be used
and still give the same result.
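
For illustration only, the stopping rule described above (iterate until two consecutive
fuzzy c-partitions U differ by no more than the error tolerance) can be sketched in
Python for one-dimensional data. This is a generic fuzzy c-means sketch and not a
reproduction of Choe's ALGORITHM 1: the function name, the fuzzifier m = 2, the use of
the largest absolute membership change as the "difference", and the sample data are all
assumptions.

```python
def fcm_1d(xs, centers, m=2.0, eps=1e-4, max_iter=100):
    """One-dimensional fuzzy c-means.

    Stops when the largest change in any membership between two
    consecutive partitions is <= eps (assumed form of the criterion).
    Returns (centers, U, iterations_used).
    """
    power = 2.0 / (m - 1.0)

    def memberships(cs):
        U = []
        for x in xs:
            dists = [abs(x - c) for c in cs]
            if 0.0 in dists:
                # Point sits exactly on a center: share full membership there.
                hits = dists.count(0.0)
                U.append([1.0 / hits if d == 0.0 else 0.0 for d in dists])
            else:
                U.append([1.0 / sum((d_k / d_j) ** power for d_j in dists)
                          for d_k in dists])
        return U

    U = memberships(centers)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Each center becomes the membership-weighted mean of the data.
        centers = [
            sum((row[k] ** m) * x for row, x in zip(U, xs))
            / sum(row[k] ** m for row in U)
            for k in range(len(centers))
        ]
        U_new = memberships(centers)
        change = max(abs(a - b)
                     for new_row, old_row in zip(U_new, U)
                     for a, b in zip(new_row, old_row))
        U = U_new
        if change <= eps:  # consecutive partitions agree within tolerance
            break
    return centers, U, iteration


# Hypothetical data with two well-separated groups.
data = [0.8, 1.0, 1.2, 8.8, 9.0, 9.2]
final_centers, final_U, used = fcm_1d(data, centers=[0.0, 10.0])
print(final_centers, used)
```

Note that the tolerance here governs when the iteration terminates; the sketch makes no
claim about how a domain ratio would behave if substituted for it.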

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for 
all obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

This application currently names joint inventors. In considering patentability of
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of
the various claims was commonly owned at the time any inventions covered therein
were made absent any evidence to the contrary. Applicant is advised of the obligation
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was
not commonly owned at the time a later invention was made in order for the examiner to
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g)
prior art under 35 U.S.C. 103(a).

3. Claims 1-5, 9-13, 16-22, 26-30 and 33-34 are rejected under 35 U.S.C.
103(a) as being unpatentable over Hall et al. [Generating Fuzzy Rules from Data].

Regarding claims 1 and 18, Hall teaches a method of developing fuzzy rules
from continuous valued data by exploiting the properties of decision trees; a crisp
decision tree is created by creating a discrete set of fuzzy output classes and providing
a set of training examples to the decision tree learning system (abstract). Hall does not
explicitly teach the steps of selecting a feature from among the features characterizing the
data associated with the node; performing a cluster analysis along the selected feature to
group the data into one or more clusters; and constructing one or more arcs of the decision
tree at the node respectively for each of the one or more clusters. However, table 1 shows
a training set from the domain of tennis, used to determine whether to play tennis
based on the Weather (Sunny, Cloudy), Wind (Windy, Quiet), and Temperature (0, 100° F),
with two outcomes (Play, Don't Play). The training set is given to C4.5 as the
decision tree learning system. The decision tree allows the classification of examples
into two classes, each class being associated with a node of the tree, by choosing an
attribute whose values may split the examples up into more homogeneous groups. As
shown in FIG. 1, the attribute Temperature is chosen (Decision trees from C4.5,
page 1757), which corresponds to the step of selecting a feature from among the features
characterizing the data associated with the node. Hall further discloses that the values of a
continuous valued attribute are each examined as a possible split of the
example set of a node in a decision tree and a value in the data set is chosen as the
"split point" (Decision trees from C4.5, pages 1757-1758), which corresponds to the step of
performing a cluster analysis along the selected feature to group the data into one or more
clusters. As shown in FIG. 1, the root node of the tree is labeled Temperature and indicates
that the temperature of the weather is tested. The right arc that connects the Temperature
node to leaf node No Play is labeled > 80, indicating that leaf node No Play is to be
reached if the temperature is greater than 80. On the other hand, the left arc, which connects
the Temperature node to another branch node, is labeled <= 80, indicating that the branch
node is to be reached if the temperature is <= 80. The branch node is labeled Wind.
Thus, the technique as shown in FIG. 1 indicates the step of constructing one or more
arcs of the decision tree at the node respectively for each of the one or more clusters.
Therefore, it would have been obvious for one of ordinary skill in the art at the time the
invention was made to modify the Hall method of generating a decision tree by including
the steps of selecting a feature, performing cluster analysis, and constructing one or more
arcs of the decision tree in order to classify records of unknown class.

Regarding claims 2 and 19, Hall teaches all the claimed subject matter as
discussed with respect to claims 1 and 18. Hall further discloses the steps of performing a
plurality of cluster analyses along each of the features to calculate a maximal cluster
validity measure, said maximal cluster validity measure corresponding to one of the
features; and selecting the one of the features that corresponds to the maximal cluster
validity measure (Decision trees from C4.5, pages 1757-1758).

Regarding claims 3 and 20, Hall teaches all the claimed subject matter as
discussed with respect to claims 2 and 19. Hall further discloses the step: for each of the
features, performing a plurality of cluster analyses along said each of the features for a
plurality of cluster numbers to calculate respective partition coefficients; and determining
the maximal cluster validity measure from among the partition coefficients (Decision trees
from C4.5, pages 1757-1758).

Regarding claims 4 and 21, Hall teaches all the claimed subject matter as
discussed with respect to claims 1 and 18. Hall further discloses that the step of performing
the cluster analysis includes the step of performing a fuzzy cluster analysis (Decision trees
from C4.5, pages 1757-1758).

Regarding claims 5 and 22, Hall teaches all the claimed subject matter as
discussed with respect to claims 4 and 21. Hall further discloses that the step of performing
the fuzzy cluster analysis includes the step of performing a fuzzy c-means analysis (Creating
class labels and FCG, pages 1758-1760).

Regarding claims 9 and 26, Hall teaches all the claimed subject matter as
discussed with respect to claims 1 and 18. Hall further discloses the steps of projecting the
data in each of the clusters, wherein the projected data are characterized by the plurality of
the features but for the selected feature; and recursively performing the steps of selecting a
feature and performing the cluster analysis on the projected data in each of the clusters (FIG.
1, Decision trees from C4.5, pages 1757-1758).

Regarding claims 10 and 27, Hall teaches a method of developing fuzzy rules
from continuous valued data by exploiting the properties of decision trees; a crisp
decision tree is created by creating a discrete set of fuzzy output classes and providing
a set of training examples to the decision tree learning system (abstract). Hall does not
explicitly teach the steps of performing a plurality of cluster analyses along the selected
feature to calculate a maximal cluster validity measure, said maximal cluster validity measure
corresponding to one of the features; selecting the one of the features corresponding to the
maximal cluster validity measure; subdividing the data into one or more groups based on the
selected feature; and building the decision tree based on the one or more groups. However,
table 1 shows a training set from the domain of tennis, used to determine whether to
play tennis based on the Weather (Sunny, Cloudy), Wind (Windy, Quiet), and Temperature
(0, 100° F), with two outcomes (Play, Don't Play). The training set is given to
C4.5 as the decision tree learning system. The decision tree allows the classification of
examples into two classes by choosing an attribute whose values may split the
examples up into more homogeneous groups. The values of a continuous
valued attribute are each examined as a possible split of the example set of a
node in a decision tree. The selection of a specific value is based upon the maximum
information gain ratio associated with choosing that attribute. The attribute that has
the highest information gain associated with it is chosen as the attribute for splitting the
examples at a node (Decision trees from C4.5, pages 1757-1758 and FIG. 1). This
technique indicates the steps of performing a plurality of cluster analyses along the selected
feature to calculate a maximal cluster validity measure, said maximal cluster validity measure
corresponding to one of the features; and selecting the one of the features corresponding to
the maximal cluster validity measure. As shown in FIG. 1, based on the maximum information
gain of 0.459 using 80 as the split point, the decision tree is produced, which corresponds
to the steps of subdividing the data into one or more groups based on the selected feature;
and building the decision tree based on the one or more groups. Therefore, it would have
been obvious for one of ordinary skill in the art at the time the invention was made to
modify the Hall method of generating a decision tree by performing a cluster analysis,
selecting the feature based on maximal cluster validity measure and subdividing the data in
order to classify records of unknown class.

Regarding claims 11 and 28, Hall teaches all the claimed subject matter as
discussed with respect to claims 10 and 27. Hall further discloses the step: for each of the
features, performing a plurality of cluster analyses along said each of the features for a
plurality of cluster numbers to calculate respective partition coefficients; and determining
the maximal cluster validity measure from among the partition coefficients (Decision trees
from C4.5, pages 1757-1758).

Regarding claims 12 and 29, Hall teaches all the claimed subject matter as
discussed with respect to claims 10 and 27. Hall further discloses that the step of performing
the cluster analysis includes the step of performing a fuzzy cluster analysis (Decision trees
from C4.5, pages 1757-1758).

Regarding claims 13 and 30, Hall teaches all the claimed subject matter as
discussed with respect to claims 10 and 27. Hall further discloses that the step of performing
the fuzzy cluster analysis includes the step of performing a fuzzy c-means analysis (Creating
class labels and FCG, pages 1758-1760).

Regarding claims 16 and 33, Hall teaches all the claimed subject matter as
discussed with respect to claims 10 and 27. Hall further discloses the steps of projecting the
data in each of the groups, wherein the projected data are characterized by the plurality of
the features but for the selected feature; and recursively performing the steps of selecting a
feature, comprising selecting a new one of the features corresponding to a new maximal
partition coefficient and subdividing the data into one or more new groups based on the
selected new feature (FIG. 1, Decision trees from C4.5, pages 1757-1758).

Regarding claims 17 and 34, Hall teaches a method of developing fuzzy rules
from continuous valued data by exploiting the properties of decision trees; a crisp
decision tree is created by creating a discrete set of fuzzy output classes and providing
a set of training examples to the decision tree learning system (abstract). Table 1 shows
a training set from the domain of tennis, used to determine whether to play tennis
based on the Weather (Sunny, Cloudy), Wind (Windy, Quiet), and Temperature (0, 100° F),
with two outcomes (Play, Don't Play). The training set is given to C4.5 as the
decision tree learning system. The decision tree allows the classification of examples
into two classes by choosing an attribute whose values may split the examples up into
more homogeneous groups. The values of a continuous valued attribute are
each examined as a possible split of the example set of a node in a decision
tree. The selection of a specific value is based upon the maximum information gain or
maximal partition coefficient (Decision trees from C4.5, pages 1757-1758), which corresponds
to the step of performing a plurality of fuzzy cluster analyses along each of the features to
calculate a maximal partition coefficient and a corresponding set of one or more fuzzy
clusters, said maximal partition coefficient corresponding to one of the features. As
illustrated in FIG. 1, the attribute Temperature is chosen with the maximum information gain
of 0.459, which corresponds to the step of selecting the one of the features corresponding to
the maximal partition coefficient. As shown in FIG. 1, based on the maximum information gain
of 0.459 using 80 as the split point, the decision tree is produced, which corresponds to the
step of building the decision tree based on the corresponding set of one or more fuzzy
clusters. Therefore, it would have been obvious for one of ordinary skill in the art at the
time the invention was made to modify the Hall method of generating a decision tree by
performing a fuzzy cluster analysis and selecting the feature based on maximal partition
coefficient in order to classify records of unknown class.

4. Claims 6, 14, 23 and 31 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Hall et al. [Generating Fuzzy Rules from Data] in view of Shafer
et al. [SPRINT: A Scalable Parallel Classifier for Data Mining].

Regarding claims 6, 14, 23 and 31, Hall teaches all the claimed subject
matter as discussed with respect to claims 1, 10, 18 and 27, but fails to disclose that the
step of performing the cluster analysis includes the step of performing a hard cluster
analysis. Shafer teaches a method of forming a decision tree by performing a hard cluster
analysis (Shafer, SPRINT: A Scalable Parallel Classifier for Data Mining, pages 544-550,
especially Abstract and Introduction, pages 544-545). Therefore, it would have been
obvious for one of ordinary skill in the art at the time the invention was made to modify
the Hall method by including the technique of hard cluster analysis in order to optimize
the system by using a regular cluster for classifying records of unknown class.

5. Claims 7-8, 15, 24-25 and 32 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Hall et al. [Generating Fuzzy Rules from Data] in view of 
Choe et al. [On the Optimal Choice of Parameters in a Fuzzy C-Means Algorithm]. 

Regarding claims 7, 15, 24 and 32, Hall teaches all the claimed subject
matter as discussed with respect to claims 1, 10, 18 and 27, but fails to disclose the steps of
calculating a domain ratio of a difference in domain limits of the data over a difference in
domain limits of a superset of the data; determining whether the domain ratio has a
predetermined relationship with a predetermined threshold; and if the domain ratio has the
predetermined relationship with the predetermined threshold, then grouping the data into a
single cluster. Choe teaches a clustering criterion based on an error tolerance ε as a
predetermined threshold: a cluster center is calculated to update a fuzzy c-partition U, and if
the difference between two consecutive U is less than or equal to the error tolerance (the
predetermined relationship), the data is grouped into the cluster (Choe, Fuzzy C-Means
Algorithm, ALGORITHM 1, Step 6). Thus, instead of the difference between two
consecutive U, a domain ratio could be used and still give the same result. Therefore, it
would have been obvious for one of ordinary skill in the art at the time the invention was
made to modify the Hall method by using a domain ratio in order to cluster data in a
finite set.
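
For illustration only, the claimed domain ratio test can be sketched in Python from the
claim language quoted above. The function names, the threshold value of 0.1, the
handling of a zero-width superset, and the sample data are hypothetical; the "less than"
comparison follows the relationship recited in claims 8 and 25.

```python
def domain_ratio(data, superset):
    """Difference in domain limits of the data over the difference in
    domain limits of a superset of the data."""
    superset_span = max(superset) - min(superset)
    if superset_span == 0:
        # Assumption: a zero-width superset is treated as ratio 0.
        return 0.0
    return (max(data) - min(data)) / superset_span


def is_single_cluster(data, superset, threshold=0.1):
    """True when the domain ratio is less than the threshold, in which
    case the data would be grouped into a single cluster."""
    return domain_ratio(data, superset) < threshold


# Hypothetical temperature readings within a 0-100 superset.
all_temps = [0, 10, 70, 71, 72, 90, 100]
print(is_single_cluster([70, 71, 72], all_temps))  # -> True  (ratio 0.02)
print(is_single_cluster([10, 90], all_temps))      # -> False (ratio 0.8)
```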

Regarding claims 8 and 25, Hall and Choe teach all the claimed subject
matter as discussed with respect to claims 7 and 24. Choe further discloses the step of
determining whether the domain ratio is less than the predetermined threshold (Choe, Fuzzy
C-Means Algorithm, ALGORITHM 1, Step 6).

Conclusion 

6. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of 
time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within
TWO MONTHS of the mailing date of this final action and the advisory action is not
mailed until after the end of the THREE-MONTH shortened statutory period, then the
shortened statutory period will expire on the date the advisory action is mailed, and any
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later
than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Hung Pham whose telephone number is 703-605 4242. 
The examiner can normally be reached on Monday-Friday, 7:00 AM - 3:30 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, VU, KIM YEN can be reached on 703-305 4393. The fax phone numbers 
for the organization where this application or proceeding is assigned are 703-746 7239 
for regular communications and 703-746 7238 for After Final communications. 
Any inquiry of a general nature or relating to the status of this application or proceeding 
should be directed to the receptionist whose telephone number is 703-305 3900. 

Examiner: Hung Pham 
March 19, 2003 

SUPERVISORY PATENT EXAMINER
TECHNOLOGY CENTER 2100




